Applied AI Example: Literature Searching & Automating Reference Manager with Claude

By:
Alexander Scheer, CISSP, CISM, AAISM, QTE 501
Mantle AI, Inc.

In this post, I’ll walk through two related workflows:

  1. Using Claude and MCP to make literature search less painful, especially when working across open-access sources.

  2. Using Claude Code with a reference manager to maintain, update, and extract more value from a research corpus. In my case, that reference manager is Zotero, though the same pattern should translate to other tools. There are also open-source MCP projects for Mendeley and EndNote; I couldn’t find one for ReadCube. The configuration will differ, but the underlying concept is the same.

Where this is relevant:

This is most useful in disciplines where research is a core part of the job and where the volume of material starts to benefit from LLM and agentic capabilities: synthesis, abstraction across APIs and sources, pattern identification, and the ability to reason across messy data sets.

For years, I’ve been frustrated by the inefficiency of literature searching: finding papers, downloading them, tagging them, keeping them sorted in my reference manager, and then later going back to figure out what I saved, why I saved it, and which papers are actually relevant to a particular topic.

My search process has also always been frustrating. The results are inconsistent, and I often do not find what I’m looking for because, in part, I can’t describe my inquiry in natural language. Or, more accurately, I can describe it in natural language — just not in the exact combination of terms, Boolean operators, and database-specific query magic required to obtain a useful result.

On one hand, this is almost certainly a “me” problem.

On the other hand, the beauty of AI is that it lets you spend more time on the meta-layer: the goal you’re trying to achieve, the problem you are trying to solve, the question you’re trying to answer, and the output you would like. This translates to spending less time on non-differentiated activities like fine-tuning a laborious workflow, wrangling custom Python scripts, and convincing myself that my time is being time spent wisely on this activities.

I’ve been experimenting with two tools that seem to augment my specific working pattern, with Zotero serving as my reference manager:

  • MCP for Zotero

  • paper-search-mcp

Having now combined them with a simple workflow in Claude, I’m starting to see real efficiencies — and, more interestingly, a new way to breathe life into the research corpora I’ve been building in Zotero: the pile of papers, notes, websites, PDFs, videos, and other “I should come back to this later” material.

I know this will have utility for others, particularly in fields like biotech, where we almost exclusively work.

So, this is a writeup of how the workflow works for me, including how it enabled an actual literature search inquiry I had and helped me get the discovered content into my reference manager.

My Prompt

“I’m looking for papers related to the latest applied thinking on ICH E6 R3 as a clinical-trial Sponsor with a consortium of third-party partners — CROs, clinical sites, regulatory consultants, and the like. To the extent possible, find papers covering wearables. I’m looking for at least 5 papers.”

The tools

Both of these connect to Claude through the Model Context Protocol (“MCP”), which is an open standard that lets an AI assistant reach outside tools and data without anyone having to mess with each connection through progammatic interfaces ("APIs"). The built-in connectors in Claude all use MCP.

The first, paper-search-mcp, is an open-source project built on a free-first principle, which is to say it leans on open and public sources before it reaches for anything paid.

What is great about it is that I can give it a single query in natural language, and it fans that query out across more than twenty open scholarly sources at once, among them arXiv, PubMed, Crossref, OpenAlex, Europe PMC, bioRxiv, and Semantic Scholar, then it deduplicates whatever comes back and pulls the open-access full text wherever a lawful link exists.

Note: See the end of this blog for the matrix from the paper-search-MCP repository on what’s supported.

The second, the Zotero connector, is a hosted proxy that sits in front of the Zotero API, and by the author's own description it holds nothing beyond encrypted credentials. Once it is connected, Claude can search my library, drop items into a specific collection, tag them, write notes against them, and pull citations back out.

This is where I started to see additional value once the paper searching had completed. I feel it really has unlocked a new dimension to the data in my reference manager, including how to get information in there, tagged, and annotated.

The sample workflow

The workflow around the two tools is simple...

  1. Claude Desktop: Write prompt geared around your paper search inquiry. I recommend turning off other MCP connectors and only leaving the paper-search-mcp connector enabled as Claude tends to try other tools when hitting roadblocks.

  2. Claude Code: Copy the table into Code and prompt it to “verify links/DOI references, download, import into [name of] collection, tag, and annotate with notes”.

Now, I recognize this could be integrated through Claude Cowork, bridging both connectors, but I personally prefer the terminal a lot of my AI work, so that is a style choice I’ve made. Conversely, I could also exclusively use Claude Code for the paper-search-mcp component, but I was interested in trying it in the Claude Desktop console. There is no single answer in this AI landscape - it is truly about experimenting and following what works for you.

Claude Desktop - Tabular File

That table then goes into Claude Code, and from there the Zotero connector does the admin part:
a) verify DOIs with WebFetch tool
a)pulling each paper
b) filing it into the right collection
c) tagging it
d) writing notes into the entries

Example below from Claude Code after it imported all items into my GCP collection:

And this is the data from the Zotero perspective:

What else the Zotero connector is capable of?

Core Capabilities

Search Your Zotero Library

MCP for Zotero can search across your Zotero library using natural-language requests.

It can help find items by:

  • Title

  • Author

  • Topic

  • Year

  • Tag

  • Collection

  • PDF content, where indexed by Zotero Desktop

Add New References

The tool can add new items to Zotero through natural-language descriptions.

Instead of manually entering metadata, the assistant can help look up and create records for:

  • Journal articles

  • Books

  • Theses

  • Papers

  • Other reference types supported by Zotero

It can also help with DOI lookup and metadata population.

Organize Collections

MCP for Zotero can help create and manage Zotero collections.

It can support workflows such as:

  • Creating a new collection

  • Adding items to a collection

  • Moving papers into a specific research folder

  • Organizing references around a topic or project

Manage Tags

The tool can apply and update tags on Zotero items.

This is useful for building more structured research libraries around themes, source types, or review status.

Update Metadata

MCP for Zotero can help update item metadata in Zotero.

This can include cleaning up or enriching records with better information, such as:

  • Title

  • Author

  • Publication year

  • DOI

  • Item type

  • Publication venue

  • Tags

  • Notes

This is particularly useful when references are imported with incomplete or messy metadata.

Export Citations and Bibliographies

The tool can export citations and bibliographies from Zotero.

Supported export examples include:

  • BibTeX

  • RIS

  • APA

  • Chicago

  • Other formatted citation styles supported through Zotero

Work with Group Libraries

MCP for Zotero can access Zotero group libraries in addition to a personal library.

This is useful for collaborative research workflows where a shared Zotero library is used by a team, lab, company, or project group.

Search Inside PDFs

Where Zotero Desktop has indexed PDF content, the tool can search inside PDFs.

This makes it possible to ask for papers based not only on citation metadata, but also on concepts or language contained in the actual paper text.

APPENDIX A - Getting Started

Zotero MCP Connector

  1. Go to the creator’s site: https://mcpforzotero.alejandroarnaud.dev/

  2. Sign-up with Google OAuth (asks for basic profile information). If you find you don’t want to continue using tools, you can clean-up the OAuth authorization in your Google account here: https://myaccount.google.com/connections

  3. Create the Zotero token: https://www.zotero.org/settings/security#applications. Click on “Create a new private key”

  4. Add the token to the configuration screen on mcpforzotero.com: “Connect your Zotero library”

  5. Create the access tokens for Claude - one for Desktop, one for Code: “Access Tokens”

  6. Copy the generated configurations below the “Access Tokens” section for your applicable terminal - Claude Desktop, Code, ChatGPT

  7. For Claude Desktop, you can find the configuration file here:
    OSX: ~/Library/Application Support/Claude/claude_desktop_config.json
    Windows: %APPDATA%\Claude\claude_desktop_config.json

  8. For Claude Code, you can find the configuration file here:
    ~/.claude.json
    or, at your project root for a given project: mcp.json

  9. Close Claude and re-open it

Paper Search MCP

  1. Load as a Claude Code skill:

    Step 1 — Clone the repo:

    git clone https://github.com/openags/paper-search-mcp.git ~/paper-search-mcp

    Step 2 — Install the skill:

    mkdir -p ~/.claude/skills/paper-search
    cp ~/paper-search-mcp/claude-code/SKILL.md ~/.claude/skills/paper-search/SKILL.md

    Step 3 — Update the repo path in the skill:

    Edit ~/.claude/skills/paper-search/SKILL.md and replace every <REPO_PATH> with the absolute path to your clone (e.g. /Users/yourname/paper-search-mcp).

  2. Install Using Smithery and Node

    npx -y @smithery/cli install @openags/paper-search-mcp --client claude

    Smithery automatically writes the correct config block for you. No manual JSON editing needed.

APPENDIX B - Paper Search MCP

Platform Capability Matrix

This matrix reflects verified live-integration results from functional and end-to-end regression tests in this repository. Columns show the highest capability level observed under normal conditions.

Platform Search Download Read Notes
arXiv Open API; reliable
PubMed ⚠️ info-only Open API; reliable
bioRxiv Open API; reliable
medRxiv Open API; reliable
Google Scholar ⚠️ Bot-detection active; set PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL
IACR Open API; reliable
Semantic Scholar ✅ (OA) ✅ (OA) Works without key (rate-limited); key improves limits; key rejection (403) retried automatically without key
Crossref ⚠️ info-only Open API; reliable
OpenAlex ⚠️ info-only Open API; reliable
PMC ✅ (OA only) ✅ (OA only) OA PDFs only; direct download may be blocked by some proxy environments
CORE ✅ (record-dependent) ✅ (record-dependent) Free key recommended; connector retries with backoff and falls back to key-less on 401/403
Europe PMC ✅ (OA) ✅ (OA) OA PDFs only; direct download may be blocked by some proxy environments
dblp ⚠️ info-only Open API; reliable
OpenAIRE Open API; retries 3× with escalating request profiles on transient 403
CiteSeerX ⚠️ ✅ (record-dependent) ⚠️ API endpoint intermittently unavailable / redirects to web archive
DOAJ ⚠️ (URL-dependent) ⚠️ (URL-dependent) PDF availability varies by article; free key raises rate limits
BASE ⚠️ ✅ (record-dependent) ✅ (record-dependent) OAI-PMH endpoint requires institutional IP registration; returns empty gracefully otherwise
Zenodo ✅ (record-dependent) ✅ (record-dependent) Open API; reliable
HAL ✅ (record-dependent) ✅ (record-dependent) Open API; reliable
SSRN ⚠️ ⚠️ best-effort ⚠️ best-effort 403 bot-detection active; public PDF only
Unpaywall ✅ (DOI lookup) Requires PAPER_SEARCH_MCP_UNPAYWALL_EMAIL
Sci-Hub (optional) ⚠️ fallback-only Optional; unstable mirrors; user responsibility
IEEE Xplore 🔑 🚧 skeleton 🚧 skeleton 🚧 skeleton Requires PAPER_SEARCH_MCP_IEEE_API_KEY to activate
ACM DL 🔑 🚧 skeleton 🚧 skeleton 🚧 skeleton Requires PAPER_SEARCH_MCP_ACM_API_KEY to activate

✅ = reliable in live tests. ⚠️ = works but subject to upstream instability or access restrictions. ❌ = not supported. 🔑 = key required. 🚧 = skeleton only.


Credential & API Key Requirements

All keys are optional unless noted. Configure them in .env (preferred) or as shell exports.

Environment Variable Provider Required? How to obtain
PAPER_SEARCH_MCP_UNPAYWALL_EMAIL Unpaywall Yes (Unpaywall disabled without it) Any valid email; register at unpaywall.org
PAPER_SEARCH_MCP_CORE_API_KEY CORE Recommended Free at core.ac.uk/services/api
PAPER_SEARCH_MCP_SEMANTIC_SCHOLAR_API_KEY Semantic Scholar Optional Free at semanticscholar.org — improves rate limits
PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL Google Scholar Optional Your HTTP/HTTPS proxy URL — bypasses bot-detection
PAPER_SEARCH_MCP_DOAJ_API_KEY DOAJ Optional Free at doaj.org — raises hourly rate limit
PAPER_SEARCH_MCP_ZENODO_ACCESS_TOKEN Zenodo Optional Free at zenodo.org — required for private records
PAPER_SEARCH_MCP_IEEE_API_KEY IEEE Xplore Required to activate Free at developer.ieee.org
PAPER_SEARCH_MCP_ACM_API_KEY ACM DL Required to activate See libraries.acm.org/digital-library/acm-open

All variables follow the PAPER_SEARCH_MCP_<NAME> prefix scheme. Legacy names without the prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.


Known Upstream Limitations

Some search failures are caused by external provider instability, not by bugs in this project:

Source Symptom Cause Workaround
Google Scholar Returns 0 results / empty HTML Bot-detection (CAPTCHA) Set PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL to a proxy
Semantic Scholar 429 rate-limited responses Anonymous access rate limit Set PAPER_SEARCH_MCP_SEMANTIC_SCHOLAR_API_KEY; if key is rejected (403) connector automatically retries without key
CORE 500 / timeout errors Unauthenticated rate limiting Set PAPER_SEARCH_MCP_CORE_API_KEY (free); connector retries with exponential backoff and falls back to key-less on 401/403
OpenAIRE Transient 403 responses IP-based session rate limiting Connector retries 3× per profile, escalating: plain session → XML Accept header → raw requests.get with Mozilla UA
CiteSeerX 404 via web archive redirect PSU endpoint intermittently redirects to archive No workaround; connector returns empty gracefully
BASE Search returns 0 results OAI-PMH endpoint requires institutional IP registration Register at base-search.net for API access; connector returns empty gracefully otherwise
SSRN HTTP 403 Bot-detection (Cloudflare) No workaround; connector tries two endpoints and returns a clear message on failure
PMC / Europe PMC PDF download ProxyError Local proxy blocking direct HTTPS PDF download Disable proxy or use download_with_fallback instead
Unpaywall Skipped entirely UNPAYWALL_EMAIL env var not set Set PAPER_SEARCH_MCP_UNPAYWALL_EMAIL in .env

Optional Paid Platform Connectors (Phase 3)

IEEE Xplore and ACM Digital Library connectors are included as opt-in skeletons. They are disabled by default — no API calls are made unless you explicitly configure the corresponding keys.

Platform Env Var Status
IEEE Xplore PAPER_SEARCH_MCP_IEEE_API_KEY 🚧 skeleton — search registered, download/read raise NotImplementedError
ACM Digital Library PAPER_SEARCH_MCP_ACM_API_KEY 🚧 skeleton — search registered, download/read raise NotImplementedError